---
title: first snkrfinder.model a
keywords: fastai
sidebar: home_sidebar
nb_path: "nbs/02a_model.ipynb"
---
{% raw %}
{% endraw %} {% raw %}
{% endraw %}

Part 2: tune MobileNet_v2 feature extractor to my space

{% raw %}
print(Path().cwd())
os.chdir(L_ROOT)
print(Path().cwd())
/home/ergonyc/Projects/Project2.0/snkr-finder/nbs
/home/ergonyc/Projects/Project2.0/snkr-finder
{% endraw %} {% raw %}
filename = ZAPPOS_DF_SIMPLIFIED # "zappos-50k-simplified"
df = pd.read_pickle(f"data/{filename}.pkl")
{% endraw %}

mobilenet v2

We will use Google's MobileNetV2, trained on ImageNet and loaded from torchvision, to embed our sneakers into a feature space.

decapitate mobilenet_v2 (neuter)

Because we simply want to collect the features output by the model rather than perform classification (or some other decision), I replaced the classifier head with an identity mapper. A simple Identity nn.Module class makes this easy.

Finally, since we are calculating the features (the embedding) for over 30k images, let's load the computations onto our GPU. We need to remember to run the net in evaluation mode so Batch Norm / Dropout layers are disabled. [I forgot to do this initially and lost hours trying to figure out why I wasn't getting consistent results.] Setting param.requires_grad = False saves memory since we aren't going to fit any weights for now, and protects us in case we forget a with torch.no_grad() before inference.

Later, when we use the full FastAI API, this should all be handled elegantly behind the scenes.

{% raw %}
{% endraw %} {% raw %}

get_cuda[source]

get_cuda()

try to load onto GPU

{% endraw %} {% raw %}

get_cpu[source]

get_cpu()

set device to cpu.
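These two helpers presumably boil down to something like the following (a guess at the implementation, not the notebook's exact code):

```python
import torch

def get_cuda():
    """Try to load onto the GPU; fall back to CPU if CUDA is unavailable."""
    return torch.device("cuda" if torch.cuda.is_available() else "cpu")

def get_cpu():
    """Set device to CPU."""
    return torch.device("cpu")
```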

{% endraw %} {% raw %}
{% endraw %} {% raw %}

get_mnetV2_feature_net[source]

get_mnetV2_feature_net(to_cuda=False)

{% endraw %} {% raw %}
{% endraw %} {% raw %}

get_feats_dataloaders[source]

get_feats_dataloaders(data, batch_size, size, device)

{% endraw %} {% raw %}
{% endraw %} {% raw %}

get_all_feats0[source]

get_all_feats0(dls, model, to_df=False)

expects dls to give us items sorted alphabetically

{% endraw %} {% raw %}

get_all_feats[source]

get_all_feats(dls, conv_net)
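The extraction loop inside is presumably something like this sketch (the function name and the (images, labels) batch shape are assumptions):

```python
import torch

def extract_all_feats(dls, conv_net, device="cpu"):
    """Batched feature extraction: run every batch through the decapitated
    net under no_grad and stack the resulting feature rows."""
    conv_net.eval()                         # consistent results: no BN/Dropout
    chunks = []
    with torch.no_grad():
        for xb, _ in dls:                   # dataloader yields (images, labels)
            chunks.append(conv_net(xb.to(device)).cpu())
    return torch.cat(chunks)                # (n_images, n_features)
```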

{% endraw %} {% raw %}
sz=IMG_SIZES['small']
device = get_cuda()
batch_size = 128

#dls = get_feats_dataloaders(df,128, IMG_SIZE, get_cuda())
dls = get_feats_dataloaders(df,batch_size,sz,device)
model = get_mnetV2_feature_net(to_cuda=True)
df_f = get_all_feats(dls,model)
df_f.head()
path classes features
0 Boots/Ankle/A. Testoni/7965307.5291.jpg 0 [1.9596949, 0.91932136, 0.67717063, 1.1698788, 0.91729313, 0.12664592, 2.0470042, 1.6652943, 0.007458243, 0.0, 0.88955, 0.0, 0.0, 0.08179863, 1.42104, 0.18352564, 1.466933, 0.09476035, 1.2419412, 0.10888626, 0.8694837, 0.2977686, 0.8238469, 1.8043277, 0.023746258, 0.0, 2.669394, 0.9604887, 0.67463136, 2.4954314, 0.26203445, 1.8727907, 2.8077662, 0.0, 2.020404, 1.0696063, 0.55576074, 0.23424914, 1.8183663, 0.5816444, 3.5778742, 1.0479962, 0.40553424, 0.16061784, 1.7855675, 0.98880947, 0.06479494, 0.11203614, 0.0, 2.4086215, 0.0069243787, 0.057906527, 2.1099272, 0.0, 0.17170256, 1.7314386, 2...
1 Boots/Ankle/A. Testoni/7999255.363731.jpg 1 [0.0, 0.39735663, 0.05327405, 0.012619175, 1.382733, 0.0, 0.22705668, 0.0, 0.34412655, 0.025879573, 0.4261208, 2.6372364, 0.6075392, 0.002119399, 0.31160071, 0.0, 0.0, 0.16088045, 0.09003007, 0.8797595, 0.69514155, 1.0136069, 0.0, 0.46500212, 0.39483, 0.14675963, 1.0362822, 0.22008514, 0.0, 1.5351506, 0.9941812, 0.086842395, 3.1033688, 0.038136136, 0.9855597, 0.90458345, 0.5759543, 1.9182389, 0.0, 0.800809, 2.8951347, 2.3592372, 0.8275615, 1.2117064, 1.4250538, 0.118127756, 0.32974303, 0.06252143, 0.0, 0.7625757, 0.0, 1.1913197, 0.40655863, 0.0, 0.40836537, 1.3093466, 1.3200923, 0.0, 0.650...
2 Boots/Ankle/A. Testoni/8000978.364150.jpg 2 [0.005565671, 0.0, 0.21764839, 0.0056486325, 0.40146974, 0.0, 2.4023218, 0.0, 0.16196719, 0.0, 1.6152484, 0.42293844, 0.41474032, 0.20707399, 1.0115616, 0.0, 0.57421094, 0.42114913, 0.0, 0.08185306, 0.0, 0.36237356, 0.2305991, 0.4151259, 0.0, 0.10232711, 0.033240966, 0.0, 0.03922116, 1.4982154, 0.19737452, 1.2351397, 1.8234835, 0.0, 2.5732946, 0.79983544, 0.089553416, 2.6426082, 0.0, 0.15771051, 2.5570078, 2.586741, 1.1185422, 1.1072648, 0.8513116, 0.019243114, 0.7665231, 0.13402946, 0.0, 1.5671434, 0.006259647, 0.84074914, 0.0, 0.09688102, 0.9656724, 0.33671874, 0.36259165, 0.0, 2.1202717...
3 Boots/Ankle/AIGLE/8113228.1897.jpg 3 [0.115499705, 0.11528222, 0.006443279, 0.16754538, 0.7286024, 0.0, 3.0897799, 0.16322118, 0.002549061, 0.0128531875, 0.29116696, 0.0, 0.047397755, 0.08124481, 0.0, 0.14968784, 0.46674734, 0.37539676, 0.0, 1.979223, 0.056159757, 1.275553, 0.053019106, 0.3672771, 0.8091415, 0.00037304213, 0.036411006, 0.022591043, 1.4983549, 1.510517, 1.1573904, 0.48774213, 1.2969055, 0.01712038, 3.6611078, 0.5569439, 0.46278602, 2.3371863, 0.55478317, 0.44731188, 1.1695428, 1.1673155, 0.0, 0.23782621, 0.18047279, 0.10236487, 2.2528937, 1.6540146, 0.0, 2.0198858, 0.0, 0.69105303, 0.0, 0.0, 0.42495155, 0.8306...
4 Boots/Ankle/AIGLE/8113228.1912.jpg 4 [0.012816949, 0.11727771, 0.0, 0.1007479, 0.94233644, 0.0, 1.7602819, 0.115136325, 0.012680926, 0.009729699, 0.23919532, 0.53213644, 0.0, 0.17211361, 0.0, 0.01024299, 0.2928055, 0.32529086, 0.0, 1.7539325, 0.0, 1.5707643, 0.0, 0.7032588, 0.9458007, 0.0, 0.0, 0.06332283, 1.012034, 1.4271433, 0.9277754, 0.20370254, 0.66167593, 0.078751765, 3.3283658, 0.07726567, 0.46649575, 1.9873013, 0.046124667, 0.39887354, 1.5906644, 1.0953429, 0.0, 0.48753002, 0.0853204, 0.13520738, 1.2228394, 0.6702342, 0.0, 1.3219945, 0.0, 0.87645036, 0.0, 0.0, 0.27290767, 1.392535, 0.8794986, 0.0, 0.3727527, 0.0, 0.55...
{% endraw %} {% raw %}
{% endraw %} {% raw %}

save_featsXsize[source]

save_featsXsize(im_sizes={'small': 128, 'medium': 160, 'large': 224})

{% endraw %} {% raw %}
{% endraw %} {% raw %}

collate_featsXsize[source]

collate_featsXsize(df, dump=True)

merge the features from small/med/large

{% endraw %} {% raw %}
        
save_featsXsize()
df2 =  collate_featsXsize(df)
128
160
224
{% endraw %}
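Under the hood, collating presumably amounts to joining the per-size class/feature columns on the image path; a toy sketch with guessed column names and tiny stand-in frames:

```python
import numpy as np
import pandas as pd

# toy frames standing in for the three per-size feature tables
base = pd.DataFrame({"path": ["a.jpg", "b.jpg"]})
for abbr in ("sm", "md", "lg"):
    per_size = pd.DataFrame({
        "path": ["a.jpg", "b.jpg"],
        f"features_{abbr}": [np.zeros(1280), np.ones(1280)],
    })
    base = base.merge(per_size, on="path")   # one row per image, one col per size
```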

If we've already calculated everything just load it.

SANITY CHECK:

Just want to check that we can extract single features that match those we just calculated.

{% raw %}
query_image = "Shoes/Sneakers and Athletic Shoes/Nike/7716996.288224.jpg"

df2.loc[df2.path==query_image,['path','classes_md']]
path classes_md
27079 Shoes/Sneakers and Athletic Shoes/Nike/7716996.288224.jpg 27079
{% endraw %}

The DataBlock performed a number of processing steps to prepare the images for embedding into the MobileNet_v2 space (a 1280-dimensional vector). Let's confirm that we get the same image and MobileNet_v2 features.

That seemed to work well. I'll just wrap it in a simple function for now, though a FastAI Pipeline might work best in the long run.

{% raw %}
{% endraw %} {% raw %}

load_and_prep_sneaker[source]

load_and_prep_sneaker(image_path, size=160, to_cuda=False)

input: expects a Path(), but a string should work

output: a TensorImage ready to unsqueeze and "embed"

TODO: make this a Pipeline?

{% endraw %} {% raw %}
{% endraw %} {% raw %}

get_mnet_feature[source]

get_mnet_feature(mnetv2, t_image, to_cuda=False)

input:
mnetv2 - our neutered & prepped MobileNet_v2
t_image - ImageTensor, probably 3x224x224... but could be a batch
to_cuda - send to GPU? default is CPU (to_cuda=False)
output:
features - output of mnetv2, an n x 1280 vector

{% endraw %} {% raw %}
mnetv2=model
query_image2 = '/home/ergonyc/Downloads/491212_01.jpg.jpeg'

query_t = load_and_prep_sneaker(path_images/query_image)

test_feats = get_mnet_feature(mnetv2,query_t)
test_feats.shape
torch.Size([1, 1280])
{% endraw %}

Now I have the "embeddings" of the database in the mobileNet_v2 output space. I can do a logistic regression on these vectors (which should be equivalent to mapping these vectors to 4 categories; see Part 3), but I can also use an approximate KNN in this space to run the SneakerFinder tool.

next steps:

  • make KNN functions, maybe approximate KNN (e.g. Annoy) for speed, or precalculate.
  • PCA / tSNE / UMAP the space with categories to visualize embedding
  • make widgets

Let's find the nearest neighbors as a proxy for "similar"

I'll start with a simple "gut" test, and point out that there really isn't a ground truth to refer to. Remember that the goal of all this is to find some shoes that someone will like, and we are using "similar" as the approximation of human preference.

Let's use our previously calculated sneaker features and inspect whether the k-nearest neighbors in our embedding space feel or look "similar".

Personally, I like Jordans, so I chose this as my query_image: Sample Jordan

k-Nearest Neighbors
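The mechanics are just scikit-learn's NearestNeighbors fit on the stacked feature matrix; a self-contained sketch with random stand-ins for the 1280-d MobileNet features:

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

# random stand-ins for the 1280-d MobileNet features computed above
rng = np.random.default_rng(0)
db_feats = rng.normal(size=(100, 1280)).astype("float32")

neighs = NearestNeighbors(n_neighbors=5).fit(db_feats)
# query with a vector that is itself in the database:
dist, idx = neighs.kneighbors(db_feats[:1], return_distance=True)
# the query comes back as its own nearest neighbor, at distance ~0
```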

{% raw %}
def get_umap_reducer(latents):
    reducer = umap.UMAP(random_state=666)
    reducer.fit(latents)
    
    return reducer
{% endraw %} {% raw %}
df=df2
num_neighs = 5

knns = []
reducers = []
for i,sz in enumerate(IMG_SIZES):
    print(ABBR[sz])
    print(IMG_SIZES[sz])
    
    features = f"features_{ABBR[sz]}"
    print(features)
    
    db_feats = np.vstack(df[features].values)
    
    neighs = NearestNeighbors(n_neighbors=num_neighs)  # could use num_neighs+1 in case the query image is already in the database
    neighs.fit(db_feats)
    
    knns.append(neighs)
    
    reducer = get_umap_reducer(db_feats)
    reducers.append(reducer)
    
    
sm
128
features_sm
md
160
features_md
lg
224
features_lg
{% endraw %}

Lets take a quick look at the neighbors according to our list:

{% raw %}
neighs = knns[0]
distance, nn_index = neighs.kneighbors(test_feats, return_distance=True)    
{% endraw %} {% raw %}
dist = distance.tolist()[0] 

df.columns
Index(['CID', 'Category', 'path', 'path_and_file', 'Category1', 'Category2',
       'Filename', 'Sneakers', 'Boots', 'Shoes', 'Slippers', 'Adult', 'Gender',
       'train', 'test', 'validate', 't_t_v', 'classes_sm', 'features_sm',
       'classes_md', 'features_md', 'classes_lg', 'features_lg'],
      dtype='object')
{% endraw %} {% raw %}
paths = df[['path','classes_sm','classes_md','classes_lg']]
neighbors = paths.iloc[nn_index.tolist()[0]].copy()
{% endraw %} {% raw %}
images = [ PILImage.create(path_images/f) for f in neighbors.path] 
#PILImage.create(btn_upload.data[-1])
for im in images:
    display(im.to_thumb(IMG_SIZE,IMG_SIZE))
          
{% endraw %} {% raw %}
{% endraw %} {% raw %}

query_neighs[source]

query_neighs(q_feat, myneighs, data, root_path, show=True)

q_feat: query feature (vector)
myneighs: fitted knn object
data: series or df containing "path"
root_path: path to image files

{% endraw %} {% raw %}
similar_images = []
for i,sz in enumerate(IMG_SIZES):
    print(ABBR[sz])
    print(IMG_SIZES[sz])
    
    features = f"features_{ABBR[sz]}"
    print(features)

    query_t = load_and_prep_sneaker(path_images/query_image,IMG_SIZES[sz])
    query_f = get_mnet_feature(mnetv2,query_t)
    
    similar_images.append( query_neighs(query_f, knns[i], paths, path_images, show=False) )
     
    im = PILImage.create(path_images/query_image)
    display(im.to_thumb(IMG_SIZES[sz]))
sm
128
features_sm
md
160
features_md
lg
224
features_lg
{% endraw %} {% raw %}
{% endraw %} {% raw %}

plot_sneak_neighs[source]

plot_sneak_neighs(images)

function to plot a matrix of images; image_urls[:,0] should be the query image

Args: images: list of lists

return: None; saves an image file to the directory

{% endraw %} {% raw %}
plot_sneak_neighs(similar_images)
{% endraw %} {% raw %}
similar_images2 = []
for i,sz in enumerate(IMG_SIZES):
    print(ABBR[sz])
    print(IMG_SIZES[sz])
    
    features = f"features_{ABBR[sz]}"
    print(features)

    query_t = load_and_prep_sneaker(path_images/query_image2,IMG_SIZES[sz])
    query_f = get_mnet_feature(mnetv2,query_t)
    
    similar_images2.append( query_neighs(query_f, knns[i], paths, path_images, show=False) )

    im = PILImage.create(path_images/query_image2)
    display(im.to_thumb(IMG_SIZES[sz]))
    
plot_sneak_neighs(similar_images2)
sm
128
features_sm
md
160
features_md
lg
224
features_lg
{% endraw %}

visualize the embedding: PCA + UMAP

{% raw %}
df.columns
Index(['CID', 'Category', 'path', 'path_and_file', 'Category1', 'Category2',
       'Filename', 'Sneakers', 'Boots', 'Shoes', 'Slippers', 'Adult', 'Gender',
       'train', 'test', 'validate', 't_t_v', 'classes_sm', 'features_sm',
       'classes_md', 'features_md', 'classes_lg', 'features_lg', 'umap-one',
       'umap-two'],
      dtype='object')
{% endraw %} {% raw %}
# first simple PCA
pca = PCA(n_components=2)

for i,sz in enumerate(IMG_SIZES):
    print(ABBR[sz])
    print(IMG_SIZES[sz])
    
    features = f"features_{ABBR[sz]}"
    print(features)
    
    data = df[['Category',features]].copy()

    db_feats = np.vstack(data[features].values)

    # PCA
    pca_result = pca.fit_transform(db_feats)
    data['pca-one'] = pca_result[:,0]
    data['pca-two'] = pca_result[:,1] 
    print(f"Explained variation per principal component (sz{sz}): {pca.explained_variance_ratio_}")

    smpl_fac=.5
    #data=df.reindex(rndperm)

    plt.figure(figsize=(16,10))
    sns.scatterplot(
        x="pca-one",
        y="pca-two",
        hue="Category",
        palette=sns.color_palette("hls", 4),
        data=data.sample(frac=smpl_fac),
        legend="full",
        alpha=0.3
    )
    plt.savefig(f'PCA 2-D sz{sz}')
    plt.show()
    
    
    # get the UMAP on deck
    embedding = reducers[i].transform(db_feats)
    
    data['umap-one'] = embedding[:,0]
    data['umap-two'] = embedding[:,1] 

    plt.figure(figsize=(16,10))
    sns.scatterplot(
        x="umap-one",
        y="umap-two",
        hue="Category",
        palette=sns.color_palette("hls", 4),
        data=data.sample(frac=smpl_fac),
        legend="full",
        alpha=0.3
    )
    plt.gca().set_aspect('equal', 'datalim')
    plt.title(f'UMAP projection of mobileNetV2 embedded UT-Zappos data (sz{sz})', fontsize=24)
    plt.savefig(f'UMAP 2-D sz{sz}')
    plt.show()
sm
128
features_sm
Explained variation per principal component (szsmall): [0.10363452 0.06866893]
md
160
features_md
Explained variation per principal component (szmedium): [0.10354973 0.07163638]
lg
224
features_lg
Explained variation per principal component (szlarge): [0.11955738 0.07852242]
{% endraw %} {% raw %}
{% endraw %} {% raw %}

get_umap_embedding[source]

get_umap_embedding(latents)

{% endraw %} {% raw %}
fn = df.path.values
type(db_feats)

snk2vec = dict(zip(fn,db_feats))

snk2vec[list(snk2vec.keys())[0]]

embedding = get_umap_embedding(db_feats)
snk2umap = dict(zip(fn,embedding))
  
{% endraw %} {% raw %}
# from sklearn.manifold import TSNE

# cov_mat =np.cov(vects.T)
# plt.figure(figsize=(10,10))
# sns.set(font_scale=1.5)
# hm = sns.heatmap(cov_mat,
#                  cbar=True,
#                  annot=True,
#                  square=True,
#                  fmt='.2f',
#                  annot_kws={'size': 12},
#                  cmap='coolwarm')
# plt.title('Covariance matrix showing correlation coefficients', size = 18)
# plt.tight_layout()
# plt.show()
{% endraw %}

Now use widgets and make this into a "tool"

{% raw %}
{% endraw %} {% raw %}

on_click_find_similar[source]

on_click_find_similar(change)

{% endraw %} {% raw %}
# pca_result = pca.fit_transform(df['feats'].values.tolist())
# df['pca-one'] = pca_result[:,0]
# df['pca-two'] = pca_result[:,1] 
# df['pca-three'] = pca_result[:,2]
# print('Explained variation per principal component: {}'.format(pca.explained_variance_ratio_))


# #data=df.sample(frac=1.0)
# #data=df.reindex(rndperm)
# data = df

# #df_subset = df

# time_start = time.time()
# tsne = TSNE(n_components=2, verbose=1, perplexity=40, n_iter=300)
# tsne_results = tsne.fit_transform(db_feats)
# print('t-SNE done! Time elapsed: {} seconds'.format(time.time()-time_start))



# df['tsne-2d-one'] = tsne_results[:,0]
# df['tsne-2d-two'] = tsne_results[:,1]
# plt.figure(figsize=(16,10))
# sns.scatterplot(
#     x="tsne-2d-one", y="tsne-2d-two",
#     hue="CategoryDir",
#     palette=sns.color_palette("hls", 4),
#     data=df,
#     legend="full",
#     alpha=0.3
#)
{% endraw %}

logistic regression on the mobilenet_v2 features
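The same idea as a self-contained sketch, with synthetic 1280-d "features" standing in for the MobileNet embeddings (the real cell below uses the train/validate/test split from the dataframe):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix

# synthetic stand-ins: 200 samples of 1280-d features, 4 categories
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 1280))
y = rng.integers(0, 4, size=200)

clf = LogisticRegression(C=1, max_iter=2000)
clf.fit(X[:150], y[:150])                      # train split
score = clf.score(X[150:], y[150:])            # held-out accuracy
cm = confusion_matrix(y[150:], clf.predict(X[150:]))
```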

{% raw %}
from sklearn.metrics import confusion_matrix
from seaborn import heatmap
from sklearn.linear_model import LogisticRegression
    
#Display Confusion Matrix
X_test = np.vstack(df[df.t_t_v=='test']['features_lg'])
y_test = np.vstack(df[df.t_t_v=='test']['Category']).flatten()

# use validate and train for training (no validation here)
X_train = np.vstack(df[df.train | df.validate]['features_lg'])
y_train = np.vstack(df[df.train | df.validate]['Category']).flatten()


clf_log = LogisticRegression(C = 1, multi_class='ovr', max_iter=2000, solver='lbfgs')
clf_log.fit(X_train, y_train)
log_score = clf_log.score(X_test, y_test)
log_ypred = clf_log.predict(X_test)

log_confusion_matrix = confusion_matrix(y_test, log_ypred)
print(log_confusion_matrix)

disp = heatmap(log_confusion_matrix, annot=True, linewidths=0.5, cmap='Blues')
plt.savefig('log_Matrix.png')


plt.figure(figsize=(16,16))


# Plot non-normalized confusion matrix
titles_options = [("Confusion matrix, without normalization", None),
                  ("Normalized confusion matrix", 'true')]
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-17-a63fcf7e249c> in <module>
      4 
      5 #Display Confusion Matrix
----> 6 X_test = np.vstack(df[df.t_t_v=='test']['features_lg'])
      7 y_test = np.vstack(df[df.t_t_v=='test']['Category']).flatten()
      8 

NameError: name 'df' is not defined
{% endraw %} {% raw %}
class_names = df.Category.unique()

from sklearn.metrics import plot_confusion_matrix

for title, normalize in titles_options:
    disp = plot_confusion_matrix(clf_log, X_test, y_test,
                                 display_labels=class_names,
                                 cmap=plt.cm.Blues,
                                 normalize=normalize)
    disp.ax_.set_title(title)

    print(title)
    print(disp.confusion_matrix)

plt.savefig('log_Matrix2.png')
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-18-06a150b5b490> in <module>
----> 1 class_names = df.Category.unique()
      2 
      3 from sklearn.metrics import plot_confusion_matrix
      4 
      5 for title, normalize in titles_options:

NameError: name 'df' is not defined
{% endraw %} {% raw %}
 
{% endraw %} {% raw %}
from nbdev.export import notebook2script
notebook2script()
Converted 00_core.ipynb.
Converted 01_data.ipynb.
Converted 02a_model.ipynb.
Converted 03_model.ipynb.
Converted index.ipynb.
{% endraw %}